First, I will make a quick analysis of the Brazilian market. This analysis will help me choose which direction to follow, since I have little time to try as many approaches as possible. As a first step, we only did some data mining to prepare the data for the forecast analysis.
A remark on how I created the Average Pack Size groups: I used a boxplot to set the boundaries of each group.
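As a rough illustration of the boxplot-based grouping, the sketch below bins pack sizes by the quartiles that a boxplot draws. The pack sizes, the function name `pack_size_group`, the four group labels, and the choice of four groups are all assumptions made for the example; the actual boundaries used in this analysis may differ.

```python
from statistics import quantiles


def pack_size_group(size, q1, median, q3):
    """Assign a pack size to one of four groups bounded by the quartiles."""
    if size <= q1:
        return "small"
    if size <= median:
        return "medium"
    if size <= q3:
        return "large"
    return "extra large"


# Hypothetical average pack sizes, not the real data set.
pack_sizes = [20, 25, 30, 40, 45, 50, 80, 90, 100, 150, 200, 400]

# The three cut points are the box edges and the median line of a boxplot.
q1, median, q3 = quantiles(pack_sizes, n=4, method="inclusive")

groups = [pack_size_group(s, q1, median, q3) for s in pack_sizes]
for size, group in zip(pack_sizes, groups):
    print(size, group)
```

Using quartiles keeps the groups roughly equal in size, which is one reasonable reading of "set the size of each group" from a boxplot.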
The first chart below, with volume and value normalized, helps us understand the movement of the market. In 2016, market volume decreased more sharply than market value, although the decrease was smaller than before, as the slope of the blue line shows. One question I have about the period after June 2016 is whether price may be affecting volume, since the red line is almost flat.
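To make the idea of comparing normalized series concrete, here is a minimal sketch that rescales each series so its first observation equals 100. This is only one possible normalization and may not be the one used for the chart; the numbers and the helper name `index_to_base` are hypothetical.

```python
def index_to_base(values, base=100.0):
    """Rescale a series so that its first observation equals `base`."""
    first = values[0]
    return [base * v / first for v in values]


# Hypothetical yearly totals, not the real market data.
volume = [120.0, 110.0, 90.0, 70.0]
value = [300.0, 295.0, 290.0, 288.0]

volume_idx = index_to_base(volume)
value_idx = index_to_base(value)

# Once both series start at 100, their slopes can be compared directly.
for year, (vol, val) in enumerate(zip(volume_idx, value_idx), start=2014):
    print(year, round(vol, 1), round(val, 1))
```

With both series on the same scale, a steeper fall in one line than in the other can be read directly from the chart, regardless of the original units.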
To confirm the conclusion about price, note in the two charts below that the size of the points, which represents price, did not change over time.
For a quick understanding of which flavor has the most impact on the market, it is easy to see in the chart below that the Milk Chocolate flavor has a huge impact on the market result. So it could be interesting to run separate analyses: one cluster for Milk Chocolate and one cluster for the other flavors. Splitting the data into two clusters and zooming in on the second one, we can see that cherry, followed by coffee, is the most predominant flavor.
Now you can check the same data without normalization. I had to use three different charts because the range of values is large; plotting everything in a single chart would make the information unclear.
Looking at the packaging material market, it is easy to see that plastic is predominant. The column chart on the left shows market value and the one on the right shows market volume, both by packaging material. At the end of 2017 there seems to be an inverse correlation, although it may be just a coincidence, possibly related to the growing campaigns against plastic.
Caloric content has always been dominated by sugar, as the chart below shows.
Two package sizes lead the market over the others.
Now that we understand the market, we can select the most influential categories to run the forecast model. This usually helps computing performance because the model works with less data.
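One simple way to pick the most influential categories is to keep the largest ones until they cover a chosen share of the total. The sketch below assumes a 90% threshold and made-up flavor totals; the function name `top_categories`, the threshold, and the figures (including the Vanilla and Mint entries) are hypothetical and not taken from this analysis.

```python
def top_categories(totals, coverage=0.9):
    """Keep the largest categories until they cover `coverage` of the total."""
    grand_total = sum(totals.values())
    kept, running = [], 0.0
    # Walk the categories from largest to smallest.
    for name, amount in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        if running >= coverage * grand_total:
            break
        kept.append(name)
        running += amount
    return kept


# Hypothetical volume per flavor, not the real data.
flavor_volume = {
    "Milk Chocolate": 700.0,
    "Cherry": 120.0,
    "Coffee": 90.0,
    "Vanilla": 50.0,
    "Mint": 40.0,
}

selected = top_categories(flavor_volume, coverage=0.9)
print(selected)
```

Filtering the data to the selected categories before fitting the forecast model reduces the number of rows and series the model has to handle.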
Another way to understand the table is to plot it as a line chart by year. The decreasing movement is visible not only in the chart along the months but also when we look at the y-axis. The charts confirm the information in the table above.
Two more factors could have a strong influence on the market: Coverage and Shelf Life. The mean was calculated by year and month. As the charts show, coverage has been decreasing since January 2015, so it may be one of the factors with a strong influence on the market. Shelf life, overall, stays within the same range, and the seasonality at the end of the year is easy to notice. The histogram and density chart help us detect any outliers that could distort the mean.
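The two steps described above, averaging by year and month and checking for outliers that distort the mean, can be sketched as follows. The records are invented, and the 1.5 × IQR rule is a common convention assumed here, not necessarily the check used in this analysis; `monthly_mean` and `iqr_outliers` are hypothetical helper names.

```python
from collections import defaultdict
from statistics import mean, quantiles

# Hypothetical records: (year, month, coverage). Not the real data.
records = [
    (2015, 1, 0.82), (2015, 1, 0.80),
    (2015, 2, 0.79), (2015, 2, 0.77),
    (2015, 3, 0.75), (2015, 3, 0.10),  # 0.10 is a deliberately odd value
]


def monthly_mean(rows):
    """Average the measurement for each (year, month) pair."""
    buckets = defaultdict(list)
    for year, month, x in rows:
        buckets[(year, month)].append(x)
    return {key: mean(vals) for key, vals in sorted(buckets.items())}


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


means = monthly_mean(records)
outliers = iqr_outliers([x for _, _, x in records])

print(means)
print(outliers)
```

In this toy data the single odd value pulls the March mean far below the other months, which is the kind of distortion the histogram and density chart are meant to reveal.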
To help us identify strong correlations among the variables, the two charts below should clarify any questions that may arise. We can conclude